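This paper proposes a new method for estimating the conditional average treatment effect (CATE). The method is called TNW-CATE (Trainable Nadaraya-Watson regression for CATE) and rests on the assumption that the number of controls is rather large while the number of treated units is small. TNW-CATE uses Nadaraya-Watson regression to predict the outcomes of patients in the control and treatment groups. The main idea behind TNW-CATE is to train the kernels of the Nadaraya-Watson regression with a weight-sharing neural network of a specific form. The network is trained on the controls and replaces the standard kernels with a set of neural subnetworks with shared parameters, such that every subnetwork implements a trainable kernel while the network as a whole implements the Nadaraya-Watson estimator. The network memorizes how the feature vectors are located in the feature space. The proposed approach resembles transfer learning in the setting where the source and target domains are similar but the tasks differ. Various numerical simulation experiments illustrate TNW-CATE and compare it with the well-known T-learner, S-learner, and X-learner for several types of control and treatment outcome functions. Code implementing the TNW-CATE algorithms is available at https://github.com/stasychbr/tnw-cate.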
translated by 谷歌翻译
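As a rough illustration of the estimator that TNW-CATE builds on, the sketch below computes a CATE as the difference of two Nadaraya-Watson regressions, one per group. It uses a fixed Gaussian kernel rather than the trainable neural kernel described in the abstract; the function names, the bandwidth, and the toy data are assumptions made for illustration only.

```python
import math

def gaussian_kernel(x, xi, bandwidth):
    """Fixed Gaussian kernel (a stand-in for the trainable kernel in the abstract)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def nw_predict(x, features, outcomes, bandwidth=1.0):
    """Nadaraya-Watson estimate: kernel-weighted average of the observed outcomes."""
    weights = [gaussian_kernel(x, xi, bandwidth) for xi in features]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, outcomes)) / total

def cate(x, controls, treated, bandwidth=1.0):
    """CATE at x: treated-group regression minus control-group regression."""
    x_c, y_c = controls
    x_t, y_t = treated
    return nw_predict(x, x_t, y_t, bandwidth) - nw_predict(x, x_c, y_c, bandwidth)

# Toy data (made up): many controls with outcome x, few treated with outcome x + 2.
controls = ([[0.0], [1.0], [2.0], [3.0]], [0.0, 1.0, 2.0, 3.0])
treated = ([[1.0], [2.0]], [3.0, 4.0])
effect = cate([1.5], controls, treated, bandwidth=0.5)
print(round(effect, 6))
```

With only two treated points, the treated-group regression depends heavily on the kernel shape, which is the situation the abstract addresses by learning the kernel from the larger control group.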
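New random forest models that jointly use the attention and self-attention mechanisms are proposed for solving regression problems. The models can be regarded as extensions of the attention-based random forest, whose idea stems from applying a combination of Nadaraya-Watson kernel regression and Huber's contamination model to random forests. The self-attention aims to capture dependencies among the tree predictions and to remove noisy or anomalous predictions in the random forest. The self-attention module is trained jointly with the attention module that computes the weights. It is shown that training the attention weights reduces to solving a single quadratic or linear optimization problem. Three modifications of the general approach are proposed and compared. A specific multi-head self-attention for the random forest is also considered. The heads of the self-attention are obtained by varying its tuning parameters, including the kernel parameters and the contamination parameter of the models. Numerical experiments with various datasets illustrate the proposed models and show that adding self-attention improves model performance on many datasets.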
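A new approach called ABRF (attention-based random forest) and its modifications are proposed for applying the attention mechanism to the random forest (RF) for regression and classification. The main idea behind the proposed ABRF models is to assign attention weights with trainable parameters to decision trees in a specific way. The weights depend on the distance between an instance that falls into a given leaf of a tree and the training instances that fall into the same leaf. This idea stems from representing Nadaraya-Watson kernel regression in the form of an RF. Three modifications of the general approach are proposed. The first is based on applying Huber's contamination model and computes the attention weights by solving quadratic or linear optimization problems. The second and third modifications use gradient-based algorithms to compute the trainable parameters. Numerical experiments with various regression and classification datasets illustrate the proposed method.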
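One way to picture tree-level attention weights combined with Huber's contamination model is sketched below. It assumes that each tree is summarized by the mean feature vector of the training instances sharing the query's leaf, and that the weight mixes a distance-based softmax with a trainable vector `w` on the simplex. The exact parameterization, the names `eps` and `tau`, and the toy values are assumptions, not details taken from the abstract.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def abrf_weights(x, leaf_means, w, eps=0.2, tau=1.0):
    """Attention weight per tree (assumed form).

    leaf_means[k] is the mean feature vector of the training instances that
    share x's leaf in tree k; w is a trainable vector on the unit simplex.
    """
    scores = [-sum((a - b) ** 2 for a, b in zip(x, mean)) / (2.0 * tau)
              for mean in leaf_means]
    return [(1.0 - eps) * s + eps * wk for s, wk in zip(softmax(scores), w)]

def abrf_predict(x, leaf_means, leaf_outputs, w, eps=0.2, tau=1.0):
    """Forest prediction as the attention-weighted sum of per-tree leaf outputs."""
    alphas = abrf_weights(x, leaf_means, w, eps, tau)
    return sum(a * y for a, y in zip(alphas, leaf_outputs))

# Toy values (made up): three trees, one-dimensional features.
x = [0.0]
leaf_means = [[0.0], [1.0], [3.0]]
leaf_outputs = [1.0, 2.0, 10.0]
uniform_w = [1.0 / 3.0] * 3
alphas = abrf_weights(x, leaf_means, uniform_w)
prediction = abrf_predict(x, leaf_means, leaf_outputs, uniform_w)
```

Because the softmax term does not depend on `w`, the prediction is linear in `w` under this form, which is consistent with the abstract's statement that the weights can be found by quadratic or linear optimization.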
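A new multi-attention-based method for solving the MIL problem (MAMIL) is proposed, which takes into account the neighboring patches or instances of each analyzed patch in a bag. In the method, one attention module takes into account adjacent patches or instances, several attention modules are used to obtain a diverse feature representation of the patches, and one attention module combines the different feature representations to provide an accurate classification of each patch (instance) and of the whole bag. With MAMIL, a combined representation of patches and their neighbors is obtained in the form of low-dimensional embeddings suited to simple classification. Moreover, different types of patches are processed efficiently, and a diverse feature representation of the patches in a bag is obtained by using several attention modules. A simple approach for explaining the classification predictions of patches is proposed. Numerical experiments with various datasets illustrate the proposed method.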
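A new attention-based model for the gradient boosting machine (GBM), called AGBoost (attention-based gradient boosting), is proposed for solving regression problems. The main idea behind the proposed AGBoost model is to assign attention weights with trainable parameters to the iterations of GBM, under the condition that decision trees are the base learners in GBM. The attention weights are determined by applying properties of decision trees and by using Huber's contamination model, which provides an interesting linear dependence between the trainable parameters of the attention and the attention weights. This peculiarity allows the attention weights to be trained by solving a standard quadratic optimization problem with linear constraints. The attention weights also depend on a discount factor, a tuning parameter that determines how much the impact of a weight decreases with the number of iterations. Numerical experiments with two types of base learners, original decision trees and extremely randomized trees, on various regression datasets illustrate the proposed model.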
In this paper we explore the task of modeling (semi) structured object sequences; in particular, we focus on the problem of developing a structure-aware input representation for such sequences. We assume that each structured object is represented by a set of key-value pairs that encode the attributes of the object. Given a universe of keys, a sequence of structured objects can then be viewed as an evolution of the values for each key over time. We encode and construct a sequential representation using the values for a particular key (Temporal Value Modeling - TVM) and then self-attend over the set of key-conditioned value sequences to create a representation of the structured object sequence (Key Aggregation - KA). We pre-train and fine-tune the two components independently and present an innovative training schedule that interleaves the training of both modules with shared attention heads. We find that this iterative two-part training results in better performance than a unified network with hierarchical encoding, as well as than other methods that use a {\em record-view} representation of the sequence \cite{de2021transformers4rec} or a simple {\em flattened} representation of the sequence. We conduct experiments using real-world data to demonstrate the advantage of interleaving TVM-KA on multiple tasks, along with detailed ablation studies motivating our modeling choices. We find that our approach performs better than flattening sequence objects and also allows us to operate on significantly larger sequences than existing methods.
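The key-conditioned view described above can be pictured as a pivot of the object sequence into one value sequence per key. The sketch below shows only that data-structure step, not the TVM or KA encoders; the function name, the `missing` placeholder, and the toy records are assumptions for illustration.

```python
def key_conditioned_sequences(objects, missing=None):
    """Pivot a sequence of key-value objects into one value sequence per key.

    Each output sequence has one entry per time step; a key absent from an
    object at some step is filled with the `missing` placeholder.
    """
    keys = sorted({key for obj in objects for key in obj})
    return {key: [obj.get(key, missing) for obj in objects] for key in keys}

# Toy sequence of structured objects (made up).
sequence = [
    {"status": "open", "owner": "a"},
    {"status": "review"},
    {"status": "closed", "owner": "b"},
]
per_key = key_conditioned_sequences(sequence)
print(per_key)
```

Each per-key sequence has the length of the object sequence rather than the length of the flattened token stream, which gives some intuition for why a key-wise encoding can handle longer inputs than flattening.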
Optical coherence tomography (OCT) captures cross-sectional data and is used for the screening, monitoring, and treatment planning of retinal diseases. Technological developments that increase the speed of acquisition often result in systems with a narrower spectral bandwidth and hence a lower axial resolution. Traditionally, image-processing-based techniques have been used to reconstruct subsampled OCT data; more recently, deep-learning-based methods have been explored. In this study, we simulate reduced axial scan (A-scan) resolution by Gaussian windowing in the spectral domain and investigate the use of a learning-based approach for image feature reconstruction. In anticipation of the reduced resolution that accompanies wide-field OCT systems, we build on super-resolution techniques to explore methods that better aid clinicians in their decision-making and improve patient outcomes, by reconstructing lost features using a pixel-to-pixel approach with an altered super-resolution generative adversarial network (SRGAN) architecture.
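The effect of Gaussian windowing in the spectral domain can be illustrated on a synthetic signal. The sketch below models a single reflector as a cosine fringe across the spectrum, forms the A-scan as the magnitude of a discrete Fourier transform, and compares the peak width with and without a narrow Gaussian window. The signal model, the sample count, the window width, and all function names are assumptions chosen for illustration, not the study's actual pipeline.

```python
import cmath
import math

def gaussian_window(n, sigma):
    """Gaussian window centered on the spectrum; a smaller sigma means narrower bandwidth."""
    center = (n - 1) / 2.0
    return [math.exp(-((k - center) ** 2) / (2.0 * sigma ** 2)) for k in range(n)]

def a_scan(spectrum):
    """Magnitude of the DFT of a spectral interferogram (positive depths only)."""
    n = len(spectrum)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * z / n) for k, s in enumerate(spectrum)))
        for z in range(n // 2)
    ]

def width_at_half_max(profile):
    """Number of depth bins at or above half of the peak value."""
    half = max(profile) / 2.0
    return sum(1 for value in profile if value >= half)

# Synthetic interferogram (made up): one reflector at depth bin 20.
n, depth = 128, 20
spectrum = [math.cos(2.0 * math.pi * depth * k / n) for k in range(n)]
windowed = [s * w for s, w in zip(spectrum, gaussian_window(n, sigma=8.0))]

full_width = width_at_half_max(a_scan(spectrum))
reduced_width = width_at_half_max(a_scan(windowed))
print(full_width, reduced_width)
```

Narrowing the window in the spectral domain widens the reflector's peak in depth, which is the loss of axial resolution that a reconstruction network would then try to undo.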
Real-life tools for decision-making in many critical domains are based on ranking results. With the increasing awareness of algorithmic fairness, recent works have presented measures for fairness in ranking. Many of those definitions consider the representation of different ``protected groups'' in the top-$k$ ranked items, for any reasonable $k$. Given the protected groups, confirming algorithmic fairness is a simple task. However, the groups' definitions may be unknown in advance. In this paper, we study the problem of detecting groups with biased representation in the top-$k$ ranked items, eliminating the need to pre-define protected groups. The number of possible groups can be exponential, making the problem hard. We propose efficient search algorithms for two different fairness measures: global representation bounds and proportional representation. We then propose a method to explain the bias in the representation of groups using the notion of Shapley values. We conclude with an experimental study showing the scalability of our approach and demonstrating the usefulness of the proposed algorithms.
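To make the detection problem concrete, the sketch below enumerates every group definable by fixing values for a subset of attributes and reports those whose count in the top-$k$ falls below a global lower bound. This is a brute-force baseline that is exponential in the number of attributes, not the efficient search the abstract refers to; the function names, the `min_size` filter, and the toy ranking are assumptions for illustration.

```python
from itertools import combinations, product

def matches(item, pattern):
    """True if the item has every attribute value required by the pattern."""
    return all(item.get(attr) == value for attr, value in pattern)

def underrepresented_groups(ranked, attributes, k, lower_bound, min_size=1):
    """Brute-force search for groups with fewer than lower_bound members in the top-k.

    A group is a pattern: a tuple of (attribute, value) pairs. Groups with
    fewer than min_size members overall are ignored.
    """
    domains = {a: sorted({item[a] for item in ranked}) for a in attributes}
    top_k = ranked[:k]
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            for values in product(*(domains[a] for a in attrs)):
                pattern = tuple(zip(attrs, values))
                total = sum(matches(item, pattern) for item in ranked)
                in_top = sum(matches(item, pattern) for item in top_k)
                if total >= min_size and in_top < lower_bound:
                    found.append((pattern, in_top, total))
    return found

# Toy ranking (made up), best item first.
ranked = [
    {"gender": "F", "region": "north"},
    {"gender": "M", "region": "north"},
    {"gender": "M", "region": "south"},
    {"gender": "M", "region": "north"},
    {"gender": "F", "region": "south"},
    {"gender": "F", "region": "south"},
]
result = underrepresented_groups(ranked, ["gender", "region"], k=4, lower_bound=1, min_size=2)
print(result)
```

In this toy ranking neither attribute alone reveals a problem, but the intersection of the two does, which is why groups cannot simply be fixed in advance from single attributes.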
Previous fine-grained datasets mainly focus on classification and are often captured in a controlled setup, with the camera focusing on the objects. We introduce the first Fine-Grained Vehicle Detection (FGVD) dataset in the wild, captured from a moving camera mounted on a car. It contains 5502 scene images with 210 unique fine-grained labels of multiple vehicle types organized in a three-level hierarchy. While previous classification datasets also include makes for different kinds of cars, the FGVD dataset introduces new class labels for categorizing two-wheelers, autorickshaws, and trucks. The FGVD dataset is challenging because it contains vehicles in complex traffic scenarios with intra-class and inter-class variations in type, scale, pose, occlusion, and lighting conditions. Current object detectors such as YOLOv5 and Faster R-CNN perform poorly on our dataset due to a lack of hierarchical modeling. Along with providing baseline results for existing object detectors on the FGVD dataset, we also present the results of combining an existing detector with the recent Hierarchical Residual Network (HRN) classifier for the FGVD task. Finally, we show that FGVD vehicle images are the most challenging to classify among the fine-grained datasets.
Three main points: 1. Data Science (DS) will be increasingly important to heliophysics; 2. Methods of heliophysics science discovery will continually evolve, requiring the use of learning technologies [e.g., machine learning (ML)] that are applied rigorously and that are capable of supporting discovery; and 3. To grow with the pace of data, technology, and workforce changes, heliophysics requires a new approach to the representation of knowledge.